 unknown space



Efficient Navigation in Unknown Indoor Environments with Vision-Language Models

Schwartz, D., Kondo, K., How, J. P.

arXiv.org Artificial Intelligence

We present a novel high-level planning framework that leverages vision-language models (VLMs) to improve autonomous navigation in unknown indoor environments with many dead ends. Traditional exploration methods often take inefficient routes due to limited global reasoning and reliance on local heuristics. In contrast, our approach enables a VLM to reason directly about occupancy maps in a zero-shot manner, selecting subgoals that are likely to yield more efficient paths. At each planning step, we convert a 3D occupancy grid into a partial 2D map of the environment, and generate candidate subgoals. Each subgoal is then evaluated and ranked against other candidates by the model. We integrate this planning scheme into DYNUS \cite{kondo2025dynus}, a state-of-the-art trajectory planner, and demonstrate improved navigation efficiency in simulation. The VLM infers structural patterns (e.g., rooms, corridors) from incomplete maps and balances the need to make progress toward a goal against the risk of entering unknown space. This reduces common greedy failures (e.g., detouring into small rooms) and achieves about 10% shorter paths on average.
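The map-preparation step described in this abstract (3D occupancy grid to partial 2D map, then candidate subgoals) can be illustrated with a minimal Python sketch. The cell labels, the "occupied wins" projection rule, and the use of frontier cells as candidates are all assumptions made for illustration; the abstract does not specify how the authors project the grid or generate candidates, and the VLM ranking step is not modeled here.

```python
import numpy as np

# Cell labels are assumptions for this sketch, not taken from the paper.
UNKNOWN, FREE, OCCUPIED = -1, 0, 1

def project_to_2d(grid3d):
    """Collapse an (X, Y, Z) grid to (X, Y): occupied wins, then free, else unknown."""
    any_occ = (grid3d == OCCUPIED).any(axis=2)
    any_free = (grid3d == FREE).any(axis=2)
    grid2d = np.full(grid3d.shape[:2], UNKNOWN, dtype=int)
    grid2d[any_free] = FREE
    grid2d[any_occ] = OCCUPIED  # written last so occupied overrides free
    return grid2d

def frontier_cells(grid2d):
    """Return (row, col) indices of free cells with a 4-connected unknown neighbour."""
    unknown = grid2d == UNKNOWN
    # Pad with False so cells outside the map never count as unknown neighbours.
    padded = np.pad(unknown, 1, constant_values=False)
    near_unknown = (
        padded[:-2, 1:-1] | padded[2:, 1:-1] | padded[1:-1, :-2] | padded[1:-1, 2:]
    )
    return np.argwhere((grid2d == FREE) & near_unknown)
```

In a pipeline of this kind, the returned cells would typically be clustered into a handful of candidate subgoals before being passed to whatever ranking model is used.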



Towards Efficient Occupancy Mapping via Gaussian Process Latent Field Shaping

Gentil, Cedric Le, Pradalier, Cedric, Barfoot, Timothy D.

arXiv.org Artificial Intelligence

Occupancy mapping has been a key enabler of mobile robotics. Originally based on a discrete grid representation, occupancy mapping has evolved towards continuous representations that can predict the occupancy status at any location and account for occupancy correlations between neighbouring areas. Gaussian Process (GP) approaches treat this task as a binary classification problem using observations of both occupied and free space. Conceptually, a GP latent field is passed through a logistic function to obtain the output class without actually manipulating the GP latent field. In this work, we propose to act directly on the latent function to efficiently integrate free space information as a prior based on the shape of the sensor's field-of-view. A major difference with existing methods is the change in the classification problem, as we distinguish between free and unknown space. The 'occupied' area is the infinitesimally thin location where the class transitions from free to unknown. We demonstrate in simulated environments that our approach is sound and leads to competitive reconstruction accuracy.
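As a rough illustration of the two ideas named in this abstract, the Python sketch below squashes a latent field through a logistic function and locates the points where the class flips. It is not the authors' GP inference: the latent values are simply given as an array, the 1D setting is a simplification, and the sign convention (positive latent means free, negative means unknown) is an assumption.

```python
import numpy as np

def logistic(f):
    """Squash latent values into class probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-f))

def transition_points(x, latent):
    """Linearly interpolate where the latent field crosses zero (probability 0.5).

    Assumed convention for this sketch: latent > 0 is free, latent < 0 is unknown,
    so each crossing marks a thin 'occupied' boundary between the two classes.
    """
    s = np.sign(latent)
    idx = np.where(s[:-1] * s[1:] < 0)[0]            # intervals with a sign change
    t = latent[idx] / (latent[idx] - latent[idx + 1])  # fraction along the interval
    return x[idx] + t * (x[idx + 1] - x[idx])
```

For example, with sample positions 0, 1, 2, 3 and latent values 2, 1, -1, -2, the single crossing is interpolated halfway between positions 1 and 2.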


Categorized Grid and Unknown Space Causes for LiDAR-based Dynamic Occupancy Grids

Jiménez-Bermejo, Víctor, Godoy, Jorge, Artuñedo, Antonio, Villagra, Jorge

arXiv.org Artificial Intelligence

Occupancy Grids have been widely used for perception of the environment, as they allow modeling the obstacles in the scene as well as free and unknown space. Recently, there has been a growing interest in the unknown space due to the necessity of better understanding the situation. Although Occupancy Grids have received numerous extensions over the years to address emerging needs, few works currently go beyond delimiting the unknown space area and seek to incorporate additional information. This work builds upon the already well-established LiDAR-based Dynamic Occupancy Grid to introduce a complementary Categorized Grid that conveys its estimation using semantic labels while adding new insights into the possible causes of unknown space. The proposed categorization first divides the space by occupancy and then further categorizes the occupied and unknown space. Occupied space is labeled based on its dynamic state and reliability, while the unknown space is labeled according to its possible causes, whether they stem from the perception system's inherent constraints, limitations induced by the environment, or other causes. The proposed Categorized Grid is showcased in real-world scenarios, demonstrating its usefulness for better situation understanding.


PUMA: Fully Decentralized Uncertainty-aware Multiagent Trajectory Planner with Real-time Image Segmentation-based Frame Alignment

Kondo, Kota, Tewari, Claudius T., Peterson, Mason B., Thomas, Annika, Kinnari, Jouko, Tagliabue, Andrea, How, Jonathan P.

arXiv.org Artificial Intelligence

Fully decentralized, multiagent trajectory planners enable complex tasks like search and rescue or package delivery by ensuring safe navigation in unknown environments. However, deconflicting trajectories with other agents and ensuring collision-free paths in a fully decentralized setting is complicated by dynamic elements and localization uncertainty. To this end, this paper presents (1) an uncertainty-aware multiagent trajectory planner and (2) an image segmentation-based frame alignment pipeline. The uncertainty-aware planner propagates uncertainty associated with the future motion of detected obstacles, and by incorporating this propagated uncertainty into optimization constraints, the planner effectively navigates around obstacles. Unlike conventional methods that emphasize explicit obstacle tracking, our approach integrates implicit tracking. Sharing trajectories between agents can cause potential collisions due to frame misalignment. Addressing this, we introduce a novel frame alignment pipeline that rectifies inter-agent frame misalignment. This method leverages a zero-shot image segmentation model for detecting objects in the environment and a data association framework based on geometric consistency for map alignment. Our approach accurately aligns frames with only 0.18 m and 2.7 deg of mean frame alignment error in our most challenging simulation scenario. In addition, we conducted hardware experiments and successfully achieved 0.29 m and 2.59 deg of frame alignment error. Together with the alignment framework, our planner ensures safe navigation in unknown environments and collision avoidance in decentralized settings.


Learning-Augmented Model-Based Planning for Visual Exploration

Li, Yimeng, Debnath, Arnab, Stein, Gregory, Kosecka, Jana

arXiv.org Artificial Intelligence

We consider the problem of robotic exploration in previously unseen environments under a predefined time limit. We propose a novel exploration approach using learning-augmented model-based planning. We generate a set of subgoals associated with frontiers on the current map and derive a Bellman Equation for exploration with these subgoals. Visual sensing and advances in semantic mapping of indoor scenes are exploited for training a deep convolutional neural network to estimate properties associated with each frontier: the expected unobserved area beyond the frontier and the expected timesteps (discretized actions) required to explore it. The proposed model-based planner is guaranteed to explore the whole scene if time permits. We thoroughly evaluate our approach on a large-scale pseudo-realistic indoor dataset (Matterport3D) with the Habitat simulator. We compare our approach with classical and more recent RL-based exploration methods. Our approach surpasses the greedy strategies by 2.1% and the RL-based exploration methods by 8.4% in terms of coverage.
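To make the idea of planning over frontier subgoals under a time budget concrete, here is a toy Python recursion in the Bellman style. It is not the equation derived in the paper: it treats the per-frontier estimates (expected area, expected timesteps) as fixed known numbers, assumes areas are additive, and takes a hypothetical `travel_time` callable as input. All names are illustrative.

```python
from functools import lru_cache

def best_plan(start, frontiers, budget, travel_time):
    """Pick the frontier visiting order that maximizes total expected area.

    frontiers: dict mapping name -> (expected_area, expected_steps), assumed given.
    travel_time: hypothetical callable (from_name, to_name) -> timesteps.
    Returns (total_area, order) for the best order that fits within the budget.
    """
    names = tuple(sorted(frontiers))

    @lru_cache(maxsize=None)
    def value(pos, remaining, time_left):
        best = (0.0, ())  # stopping here is always allowed
        for f in remaining:
            area, steps = frontiers[f]
            cost = travel_time(pos, f) + steps
            if cost > time_left:
                continue  # this frontier cannot be finished in the time left
            rest = tuple(r for r in remaining if r != f)
            sub_area, sub_order = value(f, rest, time_left - cost)
            cand = (area + sub_area, (f,) + sub_order)
            if cand[0] > best[0]:
                best = cand
        return best

    return value(start, names, budget)
```

The exhaustive recursion is exponential in the number of frontiers, so it is only meant to show the structure of the value function, not to scale.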


RAMP: A Risk-Aware Mapping and Planning Pipeline for Fast Off-Road Ground Robot Navigation

Sharma, Lakshay, Everett, Michael, Lee, Donggun, Cai, Xiaoyi, Osteen, Philip, How, Jonathan P.

arXiv.org Artificial Intelligence

A key challenge in fast ground robot navigation in 3D terrain is balancing robot speed and safety. Recent work has shown that 2.5D maps (2D representations with additional 3D information) are ideal for real-time safe and fast planning. However, the prevalent approach of generating 2D occupancy grids through raytracing makes the generated map unsafe to plan in, due to inaccurate representation of unknown space. Additionally, existing planners such as MPPI do not consider speeds in known free and unknown space separately, leading to slower overall plans. The RAMP pipeline proposed here solves these issues using new mapping and planning methods. This work first presents ground point inflation with persistent spatial memory as a way to generate accurate occupancy grid maps from classified pointclouds. Then we present an MPPI-based planner with embedded variability in horizon, to maximize speed in known free space while retaining cautionary penetration into unknown space. Finally, we integrate this mapping and planning pipeline with risk constraints arising from 3D terrain, and verify that it enables fast and safe navigation using simulations and hardware demonstrations.


Towards Multi-robot Exploration: A Decentralized Strategy for UAV Forest Exploration

Bartolomei, Luca, Teixeira, Lucas, Chli, Margarita

arXiv.org Artificial Intelligence

Efficient exploration strategies are vital in tasks such as search-and-rescue missions and disaster surveying. Unmanned Aerial Vehicles (UAVs) have become particularly popular in such applications, promising to cover large areas at high speeds. Moreover, with the increasing maturity of onboard UAV perception, research focus has been shifting toward higher-level reasoning for single- and multi-robot missions. However, autonomous navigation and exploration of previously unknown large spaces still constitutes an open challenge, especially when the environment is cluttered and exhibits large and frequent occlusions due to high obstacle density, as is the case in forests. Moreover, the problem of long-distance wireless communication in such scenes can become a limiting factor, especially when automating the navigation of a UAV swarm. In this spirit, this work proposes an exploration strategy that enables UAVs, both individually and in small swarms, to quickly explore complex scenes in a decentralized fashion. By providing the decision-making capabilities to each UAV to switch between different execution modes, the proposed strategy strikes a great balance between cautious exploration of yet completely unknown regions and more aggressive exploration of smaller areas of unknown space. This results in full coverage of forest areas of variable density, consistently faster than the state of the art. Demonstrating successful deployment with a single UAV as well as a swarm of up to three UAVs, this work sets out the basic principles for multi-robot exploration of cluttered scenes, with up to 65% speed up in the single UAV case and 40% increase in explored area for the same mission time in multi-UAV setups.


RACER: Rapid Collaborative Exploration with a Decentralized Multi-UAV System

Zhou, Boyu, Xu, Hao, Shen, Shaojie

arXiv.org Artificial Intelligence

Although the use of multiple Unmanned Aerial Vehicles (UAVs) has great potential for fast autonomous exploration, it has received far too little attention. To effectively dispatch the UAVs, a pairwise interaction based on an online hgrid space decomposition is used. It ensures that all UAVs simultaneously explore distinct regions, using only asynchronous and limited communication. Further, we optimize the coverage paths of unknown space and balance the workloads partitioned to each UAV with a Capacitated Vehicle Routing Problem (CVRP) formulation. Given the task allocation, each UAV constantly updates the coverage path and incrementally extracts crucial information to support the exploration planning. A hierarchical planner finds exploration paths, refines local viewpoints and generates minimum-time trajectories in sequence to explore the unknown space agilely and safely. The proposed approach is evaluated extensively, showing high exploration efficiency, scalability and robustness to limited communication. Furthermore, for the first time, we achieve fully decentralized collaborative exploration with multiple UAVs in the real world. Two quadrotors simultaneously explore a complex unknown environment. It is demonstrated that UAVs are particularly suited to exploring complex environments efficiently, thanks to their agility and flexibility. ... been paid to multi-UAV systems. However, using a fleet of UAVs has incredible potential, since it not only enables faster accomplishment of exploration, but also is more fault-tolerant than a single UAV. ... the coordination vulnerable and less effective. To improve the ... Secondly, many multi-robot exploration approaches solely consider the allocation of frontiers or viewpoints. Because the actual regions explored by each UAV are not accounted for, the strategies often result in interference among ...